89 research outputs found

    A Linux Kernel Scheduler Extension for Multi-core Systems

    Full text link
    The Linux kernel is mostly designed for multi-programed environments, but high-performance applications have other requirements. Such applications are run standalone, and usually rely on runtime systems to distribute the application's workload on worker threads, one per core. However, due to current OSes limitations, it is not feasible to track whether workers are actually running or blocked due to, for instance, a requested resource. For I/O intensive applications, this leads to a significant performance degradation given that the core of a blocked thread becomes idle until it is able to run again. In this paper, we present the proof-of-concept of a Linux kernel extension denoted User-Monitored Threads (UMT) which tackles this problem. Our extension allows a user-space process to be notified of when the selected threads become blocked or unblocked, making it possible for a runtime to schedule additional work on the idle core. We implemented the extension on the Linux Kernel 5.1 and adapted the Nanos6 runtime of the OmpSs-2 programming model to take advantage of it. The whole prototype was tested on two applications which, on the tested hardware and the appropriate conditions, reported speedups of almost 2x.Comment: 10 pages, 5 figures, conferenc

    Compilation for heterogeneous SoCs : bridging the gap between software and target-specific mechanisms

    Get PDF
    International audienceCurrent applications constraints are pushing for higher computation power while reducing energy consumption, driving the development of increasingly specialized socs. In the mean time, these socs are still programmed in assembly language to make use of their specific hardware mechanisms. The constraints on hardware development bringing specialization, hence heterogeneity, it is essential to support these new mechanisms using high-level programming. In this work, we use a parametric data flow formalism to abstract the application from any hardware platform. From this premise, we propose to contribute to the compilation of target independent programs on heterogeneous platforms. These developments are threefold, with 1) the support of hardware accelerators for computation using actor fusion, 2) the automatic generation of communications on complex memory layouts and 3) the synchronization of distributed cores using hardware mechanisms for scheduling. The code generation is illustrated on a telecommunication dedicated heterogeneous soc

    Contrôle d'application flot de données pour les systèmes sur puces : étude de cas sur la plateforme Magali

    Get PDF
    International audienceLes applications embarquées demandent toujours plus de puissance de calcul pour moins de consommation, avec comme conséquence l'apparition de systèmes sur puces dédiés. Dans le domaine du traitement du signal, le modèle de calcul flot de données est couramment utilisé pour la programmation de ces systèmes sur puce. Il est donc nécessaire d'avoir un modèle d'exécution adapté à ces architectures et répondant aux contraintes applicatives. Dans ce tra- vail, nous proposons un nouveau modèle d'exécution pour le contrôle d'applications flot de données. Notre approche s'appuie sur les liens entre les caractéristiques des applications et les performances selon le modèle d'exécution associé. Ce travail est illustré avec une étude de cas sur la plateforme Magali

    Cognitive Radio Programming: Existing Solutions and Open Issues

    Get PDF
    Software defined radio (sdr) technology has evolved rapidly and is now reaching market maturity, providing solutions for cognitive radio applications. Still, a lot of issues have yet to be studied. In this paper, we highlight the constraints imposed by recent radio protocols and we present current architectures and solutions for programming sdr. We also list the challenges to overcome in order to reach mastery of future cognitive radios systems.La radio logicielle a évolué rapidement pour atteindre la maturité nécessaire pour être mise sur le marché, offrant de nouvelles solutions pour les applications de radio cognitive. Cependant, beaucoup de problèmes restent à étudier. Dans ce papier, nous présentons les contraintes imposées par les nouveaux protocoles radios, les architectures matérielles existantes ainsi que les solutions pour les programmer. De plus, nous listons les difficultés à surmonter pour maitriser les futurs systèmes de radio cognitive

    A Linux kernel scheduler extension for multi-core systems

    Get PDF
    ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.The Linux kernel is mostly designed for multi-programed environments, but high-performance applications have other requirements. Such applications are run standalone, and usually rely on runtime systems to distribute the application's workload on worker threads, one per core. However, due to current OSes limitations, it is not feasible to track whether workers are actually running or blocked due to, for instance, a requested resource. For I/O intensive applications, this leads to a significant performance degradation given that the core of a blocked thread becomes idle until it is able to run again. In this paper, we present the proof-of-concept of a Linux kernel extension denoted User-Monitored Threads (UMT) which tackles this problem. Our extension allows a user-space process to be notified of when the selected threads become blocked or unblocked, making it possible for a runtime to schedule additional work on the idle core. We implemented the extension on the Linux Kernel 5.1 and adapted the Nanos6 runtime of the OmpSs-2 programming model to take advantage of it. The whole prototype was tested on two applications which, on the tested hardware and the appropriate conditions, reported speedups of almost 2x.This project is supported by the European Union’s Horizon 2021 research and innovation programme under the grant agreement No 754304 (DEEP-EST), the Ministry of Economy of Spain through the Severo Ochoa Center of Excellence Program (SEV-2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P) and by the Generalitat de Catalunya (2017-SGR-1481).Peer ReviewedPostprint (author's final draft

    Efficient Encoding of SystemC/TLM in Promela

    Get PDF
    International audienceTo deal with the ever growing complexity of Systems-on-Chip, designers use models early in the design flow. SystemC is a commonly used tool to write such models. In order to verify these models, one thriving approach is to encode its semantics into a formal language, and then to verify it with verification tools. Various encodings of SystemC into formal lan- guages have already been proposed, with different performance implications. In this paper, we investigate a new, automatic, asynchronous means to formalize models. Our encoding supports the subset of the concurrency and communication constructs offered by SystemC used for high-level modeling. We increase the confidence in the fact that encoded programs have the same semantics as the original one by model-checking a set of properties. We give experimental results on our formalization and compare with previous works

    Incremental checkpointing of program state to NVRAM for transiently-powered systems

    Get PDF
    International audienceAs technology improves, it becomes possible to design autonomous, energy-harvesting networked embedded systems, a key building block for the Internet of Things. However, running from harvested energy means frequent and unpredictable power failures. Programming such Transiently Powered Computers will remain an arduous task for the software developer, unless some OS support abstracts energy management away from application design. Various approaches were proposed to address this problem. We focus on checkpointing, i.e. saving and restoring program state to and from non-volatile memory. In this paper, we propose an incremental checkpointing scheme which aims at minimizing the amount of data written to non-volatile memory, while keeping the execution overhead as low as possible

    A QoS Monitoring System for Dataflow Programs

    Get PDF
    National audienceWith the generalization of multi-core processors, dataflow programming is regaining a strong interest, especially in the context of compute intensive multimedia applications such as video decoding. How- ever, most studies focus on static approaches to the compilation and placement problems. We advocate for dynamic adaptation of dataflow applications. In this paper, we build the first step towards this goal, namely a monitoring mechanism for observing quality-of-service properties of programs at run- time. We propose a language extension for expressing simple QoS properties over dataflow programs together with a run-time mechanism for the observation of events meaningful to the QoS establishment. We show the limited impact of such mechanisms on the application overall performances

    MPU-based incremental checkpointing for transiently-powered systems

    Get PDF
    International audienc
    corecore